3 research outputs found

    Explicit web search result diversification

    Queries submitted to a web search engine are typically short and often ambiguous. With the enormous size of the Web, a misunderstanding of the information need underlying an ambiguous query can misguide the search engine, ultimately leading the user to abandon the originally submitted query. To overcome this problem, a sensible approach is to diversify the documents retrieved for the user's query, which increases the likelihood that at least one of these documents will satisfy the user's actual information need. In this thesis, we argue that an ambiguous query should be seen as representing not one, but multiple information needs. Based upon this premise, we propose xQuAD (Explicit Query Aspect Diversification), a novel probabilistic framework for search result diversification. In particular, the xQuAD framework naturally models several dimensions of the search result diversification problem in a principled yet practical manner. To this end, the framework represents the possible information needs underlying a query as a set of keyword-based sub-queries. Moreover, xQuAD accounts for the overall coverage of each retrieved document with respect to the identified sub-queries, so as to rank highly diverse documents first. In addition, it accounts for how well each sub-query is covered by the other retrieved documents, so as to promote novelty, and hence penalise redundancy, in the ranking. The framework also models the importance of each of the identified sub-queries, so as to appropriately cater for the interests of the user population when diversifying the retrieved documents. Finally, since not all queries are equally ambiguous, the xQuAD framework caters for the ambiguity level of different queries, so as to appropriately trade off relevance for diversity on a per-query basis. The xQuAD framework is general and can be used to instantiate several diversification models, including the most prominent models described in the literature.
Within xQuAD, each of the aforementioned dimensions of the search result diversification problem can be tackled in a variety of ways. In this thesis, as additional contributions besides the xQuAD framework, we introduce novel machine learning approaches for addressing each of these dimensions. These include a learning to rank approach for identifying effective sub-queries as query suggestions mined from a query log, an intent-aware approach for choosing the ranking models most likely to be effective for estimating the coverage and novelty of multiple documents with respect to a sub-query, and a selective approach for automatically predicting how much to diversify the documents retrieved for each individual query. In addition, we perform the first empirical analysis of the role of novelty as a diversification strategy for web search. As demonstrated throughout this thesis, the principles underlying the xQuAD framework are general, sound, and effective. To validate the contributions of this thesis, we thoroughly assess the effectiveness of xQuAD under the standard experimentation paradigm provided by the diversity task of the TREC 2009, 2010, and 2011 Web tracks. The results of this investigation demonstrate the effectiveness of our proposed framework. Indeed, xQuAD attains consistent and significant improvements in comparison to the most effective diversification approaches in the literature, and across a range of experimental conditions, comprising multiple input rankings, multiple sub-query generation and coverage estimation mechanisms, as well as queries with multiple levels of ambiguity. Altogether, these results corroborate the state-of-the-art diversification performance of xQuAD.
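The abstract above names four ingredients: sub-query importance, document coverage, novelty, and a per-query trade-off between relevance and diversity. The following is a minimal sketch of how a greedy re-ranker could combine them. It is not taken from the thesis text. The function name `xquad_rerank`, the dictionary-based inputs, the mixing weight `lam`, and the toy "jaguar" scores are all assumptions made for illustration.

```python
from typing import Dict, List


def xquad_rerank(
    relevance: Dict[str, float],            # assumed estimate of P(d | q)
    coverage: Dict[str, Dict[str, float]],  # assumed estimate of P(d | s), keyed by sub-query s
    importance: Dict[str, float],           # assumed estimate of P(s | q)
    lam: float = 0.5,                       # relevance/diversity trade-off (hypothetical name)
    k: int = 10,
) -> List[str]:
    """Greedily pick up to k documents, mixing relevance with diversity."""
    remaining = set(relevance)
    selected: List[str] = []
    # For each sub-query, the probability that no selected document covers it yet.
    not_covered = {s: 1.0 for s in importance}

    def score(doc: str) -> float:
        # Diversity rewards documents covering important, still-uncovered sub-queries.
        diversity = sum(
            importance[s] * coverage[s].get(doc, 0.0) * not_covered[s]
            for s in importance
        )
        return (1.0 - lam) * relevance[doc] + lam * diversity

    while remaining and len(selected) < k:
        # Sorting first makes tie-breaking deterministic (by document id).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        # Novelty: sub-queries covered by the chosen document count less from now on.
        for s in importance:
            not_covered[s] *= 1.0 - coverage[s].get(best, 0.0)
    return selected


# Toy data for the ambiguous query "jaguar" (all numbers are made up).
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
coverage = {
    "jaguar car": {"d1": 0.9, "d2": 0.9},
    "jaguar animal": {"d3": 0.9},
}
importance = {"jaguar car": 0.5, "jaguar animal": 0.5}

diversified = xquad_rerank(relevance, coverage, importance, lam=0.5, k=3)
relevance_only = xquad_rerank(relevance, coverage, importance, lam=0.0, k=3)
```

With `lam=0.0` the sketch reduces to a plain relevance ranking. With `lam=0.5` the second car document is demoted once the first has covered the "jaguar car" sub-query, so the animal document moves up. A selective approach of the kind the abstract mentions would set `lam` per query rather than fixing it.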

    WhizKEY: um ambiente para instalação de bibliotecas digitais (WhizKEY: an environment for installing digital libraries)

    Made available in DSpace on 2019-08-12T13:34:26Z (GMT). In this dissertation, we present WhizKEY (Wizard-based blocK Ensemble Yielder), an environment to assist users in the installation of digital libraries from existing software frameworks. WhizKEY has been designed to provide users with an integrated environment for constructing and personalizing digital libraries from possibly several different software frameworks.
From a user's viewpoint, WhizKEY aims at shielding digital library designers from the intricacies of the underlying installation task, allowing them to concentrate on the requirements of the system being instantiated rather than on the installation task itself. To this end, designers are guided through the task, and every parameter they configure is verified against a set of constraints in order to guarantee the correctness of the produced installation.
From a framework developer's viewpoint, WhizKEY implements the installation task as a workflow, regarding each step in this workflow as responsible for the configuration of instances of a major aspect of a digital library (e.g., the library itself, its collections or metadata catalogs, and the services it may provide). This workflow is fully configurable to enable the installation of digital libraries based on different software frameworks.
Through two separate usability test sessions involving potential WhizKEY users, the effectiveness of the environment in easing the installation task was confirmed, and a list of problems was compiled and used in the development of its current version.